33 research outputs found
Metric Learning for Temporal Sequence Alignment
In this paper, we propose to learn a Mahalanobis distance to perform
alignment of multivariate time series. The learning examples for this task are
time series for which the true alignment is known. We cast the alignment
problem as a structured prediction task, and propose realistic losses between
alignments for which the optimization is tractable. We provide experiments on
real data in the audio-to-audio context, where we show that learning a
similarity measure leads to improvements in the performance of the alignment
task. We also propose to use this metric-learning framework to perform feature
selection and, starting from basic audio features, to build a combination of
these features with better alignment performance.
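The abstract above describes aligning multivariate time series under a learned Mahalanobis distance. As a minimal sketch (not the authors' structured-prediction learner), the snippet below shows how a given PSD matrix M would plug into a plain dynamic-time-warping alignment; the function name and the use of vanilla DTW are illustrative assumptions.

```python
import numpy as np

def mahalanobis_dtw(X, Y, M):
    """DTW alignment cost between series X (n, d) and Y (m, d), where the
    frame-to-frame cost is the squared Mahalanobis distance induced by M."""
    n, m = len(X), len(Y)
    # Pairwise squared Mahalanobis distances: (x - y)^T M (x - y).
    diff = X[:, None, :] - Y[None, :, :]                  # (n, m, d)
    cost = np.einsum('ijd,de,ije->ij', diff, M, diff)     # (n, m)
    # Standard DTW dynamic program.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

With M equal to the identity this reduces to ordinary Euclidean DTW; metric learning would replace M by a matrix fitted on series with known alignments.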
Looking Deeper into Tabular LIME
Interpretability of machine learning algorithms is an urgent need. Numerous
methods have appeared in recent years, but do their explanations make sense? In this
paper, we present a thorough theoretical analysis of one of these methods,
LIME, in the case of tabular data. We prove that in the large sample limit, the
interpretable coefficients provided by Tabular LIME can be computed in an
explicit way as a function of the algorithm parameters and some expectation
computations related to the black-box model. When the function to explain has
some nice algebraic structure (linear, multiplicative, or sparsely depending on
a subset of the coordinates), our analysis provides interesting insights into
the explanations provided by LIME. These can be applied to a range of machine
learning models including Gaussian kernels or CART random forests. As an
example, for linear functions we show that LIME has the desirable property of
providing explanations that are proportional to the coefficients of the
function to explain, and of ignoring coordinates that are not used by that
function. For partition-based regressors, on the other hand, we show that LIME
produces undesired artifacts that may yield misleading explanations.
Comment: 63 pages, 16 figures
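As a toy numerical illustration of the mechanism analysed above (a simplified sketch, not the authors' derivation): sample perturbations around the point to explain, weight them by proximity, and fit a weighted linear surrogate to the black box. For an exactly linear black box, the surrogate coefficients match the function's coefficients, and a coordinate the function ignores receives a zero coefficient, in line with the proportionality property described in the abstract. The Gaussian sampling scheme and the bandwidth `nu` are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def lime_like_explain(f, x, n_samples=5000, nu=1.0):
    """Toy LIME-style surrogate: perturb x, weight samples by proximity,
    and fit a weighted least-squares linear model to the black box f."""
    d = len(x)
    Z = x + rng.normal(size=(n_samples, d))               # perturbed samples
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * nu ** 2))
    sw = np.sqrt(w)                                       # weighted LS scaling
    A = np.hstack([np.ones((n_samples, 1)), Z])           # intercept + features
    beta, *_ = np.linalg.lstsq(A * sw[:, None], f(Z) * sw, rcond=None)
    return beta[1:]                                       # drop the intercept
```

For `f(z) = 2*z1 - z3` (with `z2` unused), the returned coefficients are `[2, 0, -1]` up to numerical precision, since the linear surrogate is then well specified.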
Explaining the Explainer: A First Theoretical Analysis of LIME
Machine learning is used more and more often for sensitive applications,
sometimes replacing humans in critical decision-making processes. As such,
interpretability of these algorithms is a pressing need. One popular algorithm
to provide interpretability is LIME (Local Interpretable Model-Agnostic
Explanation). In this paper, we provide the first theoretical analysis of LIME.
We derive closed-form expressions for the coefficients of the interpretable
model when the function to explain is linear. The good news is that these
coefficients are proportional to the gradient of the function to explain: LIME
indeed discovers meaningful features. However, our analysis also reveals that
poor choices of parameters can lead LIME to miss important features.
Comment: Accepted to AISTATS 202
The Risks of Recourse in Binary Classification
Algorithmic recourse provides explanations that help users overturn an
unfavorable decision by a machine learning system. But so far very little
attention has been paid to whether providing recourse is beneficial or not. We
introduce an abstract learning-theoretic framework that compares the risks
(i.e. expected losses) for classification with and without algorithmic
recourse. This allows us to answer the question of when providing recourse is
beneficial or harmful at the population level. Surprisingly, we find that there
are many plausible scenarios in which providing recourse turns out to be
harmful, because it pushes users to regions of higher class uncertainty and
therefore leads to more mistakes. We further study whether the party deploying
the classifier has an incentive to strategize in anticipation of having to
provide recourse, and we find that sometimes they do, to the detriment of their
users. Providing algorithmic recourse may therefore also be harmful at the
systemic level. We confirm our theoretical findings in experiments on simulated
and real-world data. All in all, we conclude that the current concept of
algorithmic recourse is not reliably beneficial, and therefore requires
rethinking.
Comment: 22 pages, 6 figures, 3 tables
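The population-level effect described above can be reproduced in a small simulation (a sketch under assumed numbers, not the paper's experimental setup): a fixed threshold classifier, a ground truth P(y=1|x) = sigmoid(4x), and a recourse policy that moves every negatively classified user just across the decision boundary, into the region of highest class uncertainty. The slope 4 and the offset 0.05 are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

def error_rate(x, y):
    """Fixed classifier: predict the favourable class iff x >= 0."""
    return np.mean((x >= 0) != y)

# Ground truth: P(y=1 | x) = sigmoid(4x), features standard normal.
n = 200_000
x = rng.normal(size=n)
y = rng.random(n) < sigmoid(4 * x)
base = error_rate(x, y)

# Recourse: negatively classified users move just across the boundary,
# and their outcome is redrawn from the ground truth at the new feature value.
moved = x < 0
x_rec = np.where(moved, 0.05, x)
y_rec = np.where(moved, rng.random(n) < sigmoid(4 * 0.05), y)
after = error_rate(x_rec, y_rec)

print(f"risk without recourse: {base:.3f}, with recourse: {after:.3f}")
```

Because the recoursed users now sit where the class probability is close to 1/2, the classifier's risk on them is close to 1/2 as well, so the population risk increases, which is the qualitative phenomenon the paper formalises.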
Consistent change-point detection with kernels
In this paper we study the kernel change-point algorithm (KCP) proposed by Arlot, Celisse and Harchaoui (2012), which aims at locating an unknown number of change-points in the distribution of a sequence of independent data taking values in an arbitrary set. The change-points are selected by model selection with a penalized kernel empirical criterion. We provide a non-asymptotic result showing that, with high probability, the KCP procedure retrieves the correct number of change-points, provided that the constant in the penalty is well chosen; in addition, KCP estimates the change-point locations at the optimal rate. As a consequence, when using a characteristic kernel, KCP detects all kinds of change in the distribution (not only changes in the mean or the variance), and it is able to do so for complex structured data (not necessarily in R^d). Most of the analysis is conducted assuming that the kernel is bounded; part of the results can be extended when we only assume a finite second-order moment.
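To make the penalized criterion concrete, here is a minimal sketch of its simplest special case: a linear kernel on a univariate signal, where the within-segment kernel least-squares cost reduces to the variance around the segment mean, and the penalized criterion is minimized exactly by dynamic programming. This is only the mean-change special case, not the general KCP procedure with an arbitrary kernel or its data-driven penalty constant.

```python
import numpy as np

def penalized_segmentation(X, penalty):
    """Penalized least-squares change-point detection on a 1-D signal
    (linear-kernel special case of KCP): minimise the sum of within-segment
    squared deviations plus `penalty` per segment, by dynamic programming."""
    n = len(X)
    # Prefix sums give each cost sum_{a<=i<b} (x_i - mean)^2 in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(X)])
    s2 = np.concatenate([[0.0], np.cumsum(X ** 2)])

    def seg_cost(a, b):
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    best = np.full(n + 1, np.inf)    # best[b]: optimal penalized cost of X[:b]
    best[0] = 0.0
    last = np.zeros(n + 1, dtype=int)
    for b in range(1, n + 1):
        for a in range(b):
            c = best[a] + seg_cost(a, b) + penalty
            if c < best[b]:
                best[b], last[b] = c, a
    # Backtrack the selected change-points.
    cps, b = [], n
    while b > 0:
        b = last[b]
        if b > 0:
            cps.append(b)
    return sorted(cps)
```

As the abstract emphasises, the penalty constant matters: too small and spurious change-points appear, too large and true ones are missed.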
Interpretable Prediction of Post-Infarct Ventricular Arrhythmia using Graph Convolutional Network
Heterogeneity of left ventricular (LV) myocardial infarction scar plays an important role as an anatomical substrate in the mechanism of ventricular arrhythmia (VA). LV myocardium thinning, as observed on cardiac computed tomography (CT), has been shown to correlate with LV myocardial scar and with abnormal electrical activity. In this project, we propose an automatic pipeline for VA prediction based on CT images, using a Graph Convolutional Network (GCN). The pipeline includes the segmentation of LV masks from the input CT image, the short-axis orientation reformatting, LV myocardium thickness computation and mid-wall surface mesh generation. An average LV mesh was computed and fitted to every patient in order to use the same number of vertices with point-to-point correspondence. The GCN model was trained using the thickness value as the node feature and the atlas edges as the adjacency matrix. This allows the model to process the data on the 3D patient anatomy and bypass the "grid" structure limitation of traditional convolutional neural networks. The model was trained and evaluated on a dataset of 600 patients (27% VA), using 451 (3/4) and 149 (1/4) patients as training and testing data, respectively. The evaluation results showed that the graph model (81% accuracy) outperformed the clinical baseline (67%), the left ventricular ejection fraction, and the scar size (73%). We further studied the interpretability of the trained model using LIME and integrated gradients, and found promising results on the personalised discovery of the specific regions within the infarct area related to arrhythmogenesis.
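The pipeline above feeds a per-vertex thickness feature and the mesh adjacency into a GCN. A minimal sketch of one graph-convolution layer is shown below; the symmetric normalization with self-loops (Kipf-Welling style) is an assumption about the exact variant used, and the real model stacks several such layers before a classification head.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: add self-loops, symmetrically normalise
    the adjacency matrix, then apply a linear map and a ReLU.
    A: (n, n) adjacency, H: (n, f_in) node features, W: (f_in, f_out)."""
    A_hat = A + np.eye(len(A))                       # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))    # D^{-1/2}
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)           # ReLU(D^-1/2 A_hat D^-1/2 H W)
```

In the setting described above, H would hold one thickness value per mesh vertex and A the edge structure of the fitted atlas mesh, so the convolution operates directly on the 3D anatomy rather than on an image grid.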
Design status of ASPIICS, an externally occulted coronagraph for PROBA-3
The "sonic region" of the solar corona remains extremely difficult to observe with spatial resolution and sensitivity sufficient to understand the fine-scale phenomena that govern the quiescent solar corona, as well as the phenomena that lead to coronal mass ejections (CMEs), which influence space weather. Improvement on this front requires eclipse-like conditions over long observation times. The space-borne coronagraphs flown so far provided continuous coverage of the external parts of the corona, but their over-occulting systems did not permit analysis of the part of the white-light corona where the main coronal mass is concentrated. The proposed PROBA-3 Coronagraph System, also known as ASPIICS (Association of Spacecraft for Polarimetric and Imaging Investigation of the Corona of the Sun), with its novel design, will be the first space coronagraph to cover the range of radial distances between ~1.08 and 3 solar radii, where the magnetic field plays a crucial role in the coronal dynamics, thus providing continuous observational conditions very close to those during a total solar eclipse. PROBA-3 is first of all a mission devoted to the in-orbit demonstration of precise formation-flying techniques and technologies for future European missions, and it will fly ASPIICS as its primary payload. The instrument is distributed over two satellites flying in formation (approx. 150 m apart) to form a giant coronagraph capable of producing a nearly perfect eclipse, allowing observation of the solar corona closer to the rim than ever before. The coronagraph instrument is developed by a large European consortium including about 20 partners from 7 countries under the auspices of the European Space Agency. This paper reviews the recent improvements and design updates of the ASPIICS instrument as it steps into the detailed design phase.
Détection de ruptures et méthodes à noyaux (Change-point detection and kernel methods)
In this thesis, we focus on a method for detecting abrupt changes in a sequence of independent observations belonging to an arbitrary set on which a positive semidefinite kernel is defined. This method, kernel change-point detection, is a kernelized version of a penalized least-squares procedure. Our main contribution is to show that, for any kernel satisfying some reasonably mild hypotheses, this procedure outputs a segmentation close to the true segmentation with high probability. This result is obtained under a boundedness assumption on the kernel, for a linear penalty as well as for another penalty function coming from model selection. The proofs rely on a concentration result for bounded random variables in Hilbert spaces, and we prove a less powerful result under relaxed hypotheses (a finite variance assumption). In the asymptotic setting, we show that we recover the minimax rate for the change-point locations without additional hypotheses on the segment sizes. We provide empirical evidence supporting these claims. Another contribution of this thesis is a detailed presentation of the different notions of distance between segmentations. Additionally, we prove a result showing that these different notions coincide for sufficiently close segmentations. From a practical point of view, we demonstrate how the so-called dimension-jump heuristic can be a reasonable choice of penalty constant when using kernel change-point detection with a linear penalty. We also show how a key quantity depending on the kernel, which appears in our theoretical results, influences the performance of kernel change-point detection in the case of a single change-point. When the kernel is translation-invariant and parametric assumptions are made, it is possible to compute this quantity in closed form. Thanks to these computations, some of them novel, we are able to study precisely the behavior of the maximal penalty constant.
Finally, we study the median heuristic, a popular tool to set the bandwidth of radial basis function kernels. For a large sample size, we show that it behaves approximately as the median of a distribution that we describe completely in the settings of the kernel two-sample test and kernel change-point detection. More precisely, we show that the median heuristic is asymptotically normal around this value.
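The median heuristic discussed above is simple to state in code: set the RBF bandwidth to the median of the pairwise distances between sample points. A minimal sketch (the squared-distance computation via inner products is a standard trick, not specific to this thesis):

```python
import numpy as np

rng = np.random.default_rng(2)

def median_heuristic_bandwidth(X):
    """Median heuristic: RBF bandwidth set to the median of the pairwise
    Euclidean distances between the rows of X (shape (n, d))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    iu = np.triu_indices(len(X), k=1)                # distinct pairs only
    return float(np.sqrt(np.median(np.maximum(d2[iu], 0.0))))
```

The asymptotic result described above says that, as the sample size grows, this random quantity concentrates (and is asymptotically normal) around the median of an explicit limiting distribution.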